Artificial Intelligence & 
AI Programming


3 main course components : 


Search Methods 


Logic & Resolution 


Uncertainty Reasoning 


Probabilistic reasoning 
Bayes’ theorem 

Belief networks 
Dempster-Shafer theory 


Fuzzy inference 
Diagnosis
Logic rules or .....??


Introduction to 
Bayesian Methods 


Diagnosis 


∀x ∃y [Symptom(x) ⇒ Disease(y)]

[Symptom(toothache) ⇒ Disease(Cavity)]

[Symptom(toothache) ⇒ Disease(Cavity) ∨
Disease(impacted wisdom tooth) ∨
Disease(gingivitis) ∨ ...]


Logic fails due to
* Laziness
* Theoretical/practical ignorance


Basic Probability - 1 


Simple rules and axioms of classical probability 


Uncertain evidence arises because of 

* INCOMPLETE - missing/unavailable data
* INEXACT - poorly measured data 

* IMPRECISE - random statistics 


Probability usually defined as

Prob() = (# desired outcomes) / (total # outcomes)


e.g. Find the probability of drawing a heart from a pack of cards
# hearts = 13, # cards = 52
P(heart) = 13/52 = 0.25
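The counting definition above can be checked with a short sketch. The deck representation and the helper name `prob` are illustrative assumptions, not part of the lecture material.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs.
RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))


def prob(event, outcomes=DECK):
    """P(event) = (# desired outcomes) / (total # outcomes)."""
    desired = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(desired, len(outcomes))


def is_heart(card):
    return card[1] == "hearts"


p_heart = prob(is_heart)
print(p_heart, float(p_heart))  # 13 hearts out of 52 cards
```

Using exact fractions keeps the count-based definition visible; a float conversion is only needed for display.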


Basic Probability - 2 


Previous example utilises the axioms .... 
* All probabilities lie between 0 and 1 
0 ≤ P() ≤ 1


* True (i.e. valid) propositions have P() = 1 and
false (i.e. unsatisfiable) propositions, P() = 0


P(True) = 1 P(False) = 0 
* Probability of a disjunction is given by 


P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
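As a quick sanity check of the disjunction axiom, the sketch below counts outcomes in a card deck. The choice of events (A = "heart", B = "face card") and all helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))


def prob(event):
    """Probability of an event by counting outcomes in the deck."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))


def is_heart(card):  # event A
    return card[1] == "hearts"


def is_face(card):  # event B
    return card[0] in {"J", "Q", "K"}


p_a = prob(is_heart)
p_b = prob(is_face)
p_a_and_b = prob(lambda c: is_heart(c) and is_face(c))
p_a_or_b = prob(lambda c: is_heart(c) or is_face(c))

# P(A or B) = P(A) + P(B) - P(A and B): the overlap is counted only once.
assert p_a_or_b == p_a + p_b - p_a_and_b
print(p_a, p_b, p_a_and_b, p_a_or_b)
```

Subtracting P(A ∧ B) removes the three heart face cards that would otherwise be counted twice.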


Basic Probability - 3 


Simple rules and axioms of classical probability 


Further properties can be derived from the 
previous axioms, for example 


P(A ∨ ¬A) = P(A) + P(¬A) − P(A ∧ ¬A)
P(True) = P(A) + P(¬A) − P(False)
1 = P(A) + P(¬A)
P(¬A) = 1 − P(A)


Basic Probability - 4 


Conditional Probability 


Unconditional (prior) probability: P(A)
Conditional (posterior) probability: P(H|E), where H is the hypothesis and E the symptom (evidence)
Conditional probability of H occurring, given evidence E is 


P(H|E) = P(H ∧ E) / P(E),   provided P(E) > 0

i.e. P(H ∧ E) = P(H|E) P(E)      PRODUCT RULE
     P(E ∧ H) = P(E|H) P(H)


The Joint Probability Distribution 


P(X_i) is a 1D vector of probabilities for the
possible values of the variable X_i.


The joint probability distribution assigns 
probabilities to all propositions in the domain. 


Then the joint is an N-D table with a value in every 
cell giving the probability of that specific state 
occurring. 


            Toothache   ¬Toothache
Cavity        0.04        0.06
¬Cavity       0.01        0.89
Total         P(T)        1 − P(T)
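A joint table like this is enough to answer any query about its variables. The sketch below marginalises and conditions on a small dictionary version of the table. It assumes the (Cavity, no toothache) cell is 0.06, a value inferred here from the requirement that the joint sums to 1; the function names are illustrative.

```python
from math import isclose

# Joint distribution P(Cavity, Toothache), keyed by (cavity, toothache).
# Assumption: the (True, False) entry 0.06 is inferred so the table sums to 1.
joint = {
    (True, True): 0.04,
    (True, False): 0.06,
    (False, True): 0.01,
    (False, False): 0.89,
}


def p_toothache(value=True):
    """Marginal P(Toothache = value): sum the joint over Cavity."""
    return sum(p for (_, tooth), p in joint.items() if tooth == value)


def p_cavity_given_toothache():
    """Conditional P(Cavity | Toothache) = P(Cavity and Toothache) / P(Toothache)."""
    return joint[(True, True)] / p_toothache(True)


assert isclose(sum(joint.values()), 1.0)
print(round(p_toothache(True), 4))          # column total P(T)
print(round(p_cavity_given_toothache(), 4))  # P(Cavity | Toothache)
```

The same pattern scales to an N-D table, although the number of cells grows exponentially with the number of variables.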


Analysis between Sleep Stages and Sleep Poses: 


[Figure: contingency table of sleep poses (Empty Bed, Prone, Right, Left, Supine) against sleep stages (SS: N1, N2, N3, REM, Awake), 1158 observations in total, with the total of each SS along the bottom.]

Pose totals: Empty Bed 24, Prone 36, Right 114, Left 579, Supine 405 (sum 1158)

Joint probability, e.g. P(N2 ∧ Left) = 281/1158
Marginal probability, e.g. P(Left) = Σ_SS P(Left ∧ SS) = 579/1158

Conditional probability of H occurring, given evidence E is

P(H|E) = P(H ∧ E) / P(E),   provided P(E) > 0

i.e. P(H ∧ E) = P(H|E) P(E)      PRODUCT RULE
     P(E ∧ H) = P(E|H) P(H)


From S Mahvash et al, University of Surrey, 2020 
Published in IEEE Trans Biomed Eng 


Analysis between Sleep Stages and Sleep Poses:

[Figure: two heat maps over sleep poses (Empty Bed, Prone, Right, Left, Supine) and sleep stages (N1, N2, N3, REM, Awake): "Probability of Sleep Stage given a Sleep Pose" and "Probability of Sleep Pose given a Sleep Stage".]


End of segment 1 


Bayes Rule 


After Reverend Thomas Bayes (1702-1761). 
Used as a cornerstone for AI reasoning systems since the 1960s,
especially in medical diagnosis (e.g. MYCIN). 


Recall 2 forms of product rule

* P(H ∧ E) = P(H|E) P(E)
  P(E ∧ H) = P(E|H) P(H)

Equate 2 RHS's and ÷ P(E)...

P(H|E) = P(E|H) P(H) / P(E)      Bayes' Rule


Example: Meningitis Problem 


Consider calculating the probability of meningitis, given
the symptom of a stiff neck.....


Background information : 
* Half of all meningitis patients have a stiff neck; 
* Incidence of meningitis = 1 / 50,000 


* Incidence of a stiff neck = 1/20. 


What is the probability of meningitis, given a stiff neck? 


Let S be the proposition that the patient has a stiff neck 


Let M be the proposition that the patient has meningitis 


P(S|M) = 0.5, P(M) = 1/50,000, P(S) = 1/20

P(M|S) = P(S|M) P(M) / P(S)      (Bayes' Rule)
       = (0.5 × 1/50,000) / (1/20)
       = 0.0002
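The meningitis calculation is a direct application of Bayes' rule and can be reproduced in a few lines. The function name `bayes` and its argument names are illustrative assumptions; the numbers are those given in the example.

```python
def bayes(p_e_given_h, p_h, p_e):
    """Bayes' rule: P(H|E) = P(E|H) P(H) / P(E), provided P(E) > 0."""
    if p_e <= 0:
        raise ValueError("P(E) must be greater than 0")
    return p_e_given_h * p_h / p_e


p_s_given_m = 0.5      # half of all meningitis patients have a stiff neck
p_m = 1 / 50_000       # incidence of meningitis
p_s = 1 / 20           # incidence of a stiff neck

p_m_given_s = bayes(p_s_given_m, p_m, p_s)
print(round(p_m_given_s, 6))  # roughly 1 in 5000
```

Even though a stiff neck is common among meningitis patients, the posterior stays small because the prior P(M) is tiny compared with P(S).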


Normalisation 


Sometimes, exact knowledge of the prior (e.g. P(S) in the
last example) may be unknown or difficult to evaluate.


Use Normalisation : 


First write:   P(H|E) = P(E|H) P(H) / P(E)                  (1)

Also:          P(¬H|E) = P(E|¬H) P(¬H) / P(E)               (2)

Adding 1 and 2 gives:

P(H|E) + P(¬H|E) = [P(E|H) P(H) + P(E|¬H) P(¬H)] / P(E)     (3)

But: P(H|E) + P(¬H|E) = 1
So using RHS of equation 3...
P(E) = P(E|H) P(H) + P(E|¬H) P(¬H)
Substitute back into equation 1 (Bayes' Rule):

P(H|E) = P(E|H) P(H) / [P(E|H) P(H) + P(E|¬H) P(¬H)]

where the denominator = P(E ∧ H) + P(E ∧ ¬H)


Normalisation 


Thus determining the extra conditional probability P(E|¬H)
allows us to avoid evaluating the prior P(E) directly.
In general we can write:

P(H|E) = α P(E|H) P(H)

where α is a normalisation constant needed to ensure
that the entries in the distribution P(H|E) sum to 1.
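Normalisation can be sketched as follows: compute the unnormalised products P(E|h) P(h) for every hypothesis h, then rescale so they sum to 1. The prior and likelihood values below are hypothetical numbers chosen only to illustrate the mechanics, and `normalise` is an illustrative helper name.

```python
def normalise(unnormalised):
    """Scale non-negative weights by alpha = 1 / sum so that they sum to 1."""
    total = sum(unnormalised.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    alpha = 1.0 / total
    return {h: alpha * w for h, w in unnormalised.items()}


# Hypothetical numbers: P(h) and P(E|h) for h in {H, not H}.
prior = {"H": 0.01, "not H": 0.99}
likelihood = {"H": 0.9, "not H": 0.1}

# P(h|E) = alpha * P(E|h) * P(h); P(E) is never needed explicitly.
posterior = normalise({h: likelihood[h] * prior[h] for h in prior})
print({h: round(p, 4) for h, p in posterior.items()})
```

The sum used to compute alpha is exactly P(E|H) P(H) + P(E|¬H) P(¬H), i.e. P(E).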


Bayesian Updating Rule 


When calculating probabilities, we are often faced with 
the need to update our estimates in the light of new 
evidence. 


Bayesian updating rule : 


P(H|E1, E2) = α P(H) P(E1|H) P(E2|H)

where α is a normalisation constant needed to ensure
that the entries in the distribution P(H|E1, E2) sum to 1.

Assumption:

This relationship only holds true when conditional independence is true.
i.e. We assume that, in this particular case, the conditional probability
P(E1|H ∧ E2) does not depend on E2. Similarly, we assume that the conditional
probability P(E2|H ∧ E1) does not depend on E1.

Formally we can write:

P(E1|H ∧ E2) = P(E1|H)
P(E2|H ∧ E1) = P(E2|H)

In general we can write:

P(H|E1, E2, ..., En) = α P(H) ∏_{k=1..n} P(Ek|H)
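The updating rule can be sketched as a product of per-evidence likelihoods followed by normalisation. The sketch assumes the pieces of evidence are conditionally independent given each hypothesis, as stated above; the probabilities and the name `bayes_update` are hypothetical.

```python
from math import prod


def bayes_update(prior, likelihoods):
    """Return P(h | E1..En) for each hypothesis h.

    prior:       {h: P(h)}
    likelihoods: list of {h: P(E_k | h)}, one dict per piece of evidence,
                 assumed conditionally independent given h.
    """
    unnormalised = {
        h: prior[h] * prod(lk[h] for lk in likelihoods) for h in prior
    }
    total = sum(unnormalised.values())
    return {h: w / total for h, w in unnormalised.items()}


# Hypothetical numbers for two pieces of evidence.
prior = {"H": 0.1, "not H": 0.9}
e1 = {"H": 0.8, "not H": 0.2}
e2 = {"H": 0.7, "not H": 0.1}

after_e1 = bayes_update(prior, [e1])
after_both = bayes_update(prior, [e1, e2])
# Updating sequentially (posterior after E1 becomes the new prior) agrees.
sequential = bayes_update(after_e1, [e2])
print(round(after_e1["H"], 4), round(after_both["H"], 4), round(sequential["H"], 4))
```

Under the independence assumption, batch and sequential updating give the same posterior, which is what makes incremental evidence handling practical.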


Ideal Bayesian Decision-Making system

Conditional Probabilities → BAYES' RULE → Posterior Probabilities → Minimum Risk Decision Maker


Realistic Bayesian 
Decision-Making System 


Bayesian Inference Network 


Ideal Bayesian system unattainable due to incomplete knowledge;
Use heuristic modelling tools to 'fill the gaps' in the knowledge base.

Simplistically:

Symptoms → Intermediate assertions → Diseases




Steps in design of Inference Network 


Input evidence 

Decision alternatives 

Intermediate assertions 

Inference limits 

Tune probabilities / inference function 


Example: The Car Doctor -1 


Symptoms :
S1 clanking engine noise
S2 car low on pick-up
S3 poor starting
S4 parts difficult to obtain


What is the truth of


* C1 the repair costs are over £1000
- difficult to infer C1 directly from S1....S4


3 'First level' hypotheses : 
H1 broken con rod 
H2 worn camshaft 
H3 car is out of tune 




Example: The Car Doctor -2 


2 ‘Second level’ hypotheses :
* H4 need to replace engine
* H5 engine needs retuning


In designing this system we would first ask our human expert a large set
of questions, so that as the AI expert, we can better understand his/her
decision-making process. We might ask schematically what evidence
is used (in our simple case this is represented as S1 - S4), and what inferences
can be made from this evidence (H1 - H5), before linking this to a prognosis or
outcome, C1.




Example: The Car Doctor - 3 


Recall from the Logic lectures that when we have a set of propositions,
let's call them A and B, then we can link these together into sentences
as conjuncts or disjuncts using logical connectives such as AND, OR
and IMPLIES etc. YOU SHOULD KNOW THESE TRUTH TABLES!


In our example here, we use ‘symptoms’ which represent our observation
propositions, so that these are either present or not present, i.e. either
asserted or not asserted.


We will shortly use Bayes' Theorem to move these to posterior probabilities.
In other words, given a set of priors (frequency of a given symptom), we
can take the known probability, say, P(S|H1) and from this compute P(H1|S),
which is the probability of finding a broken con rod given evidence of some
combination of symptoms such as poor starting, noisy engine etc.


However, we must first fill in some blanks on the inference network.....




Example: The Car Doctor - 4 


Our next problem is to try to deal with intermediate assertions. 
These represent states which are intermediate between 
diagnosing the engine fault, and finding the prognosis or 
outcome (in our case that the engine repair cost will be >£1000). 


There may be no clear way in which our expert makes these
inferences. However we need to somehow model the expert's
rather ‘blurry’ or ‘fuzzy’ decision making process. We therefore
resort to using fuzzy inference rules. These have been shown,
empirically, to imitate the way in which humans make inferences
given rather imprecise evidence.


Fuzzy inference rules, in general, take a set of n probabilities
as arguments and map them onto a single probability value,
using some function, f. The appropriate choice of function
depends on the particular application.


Formally, we can write  f: [0,1]^n → [0,1].


There are 2 common inference rules which have been used in
the past with some success in different applications. These two
rules are sometimes referred to as the possibilistic fuzzy
inference rule and the probabilistic fuzzy inference rule. We'll
add these onto the previous table now....




Example: The Car Doctor - 5 


Rule             A ∧ B       A ∨ B         A ⊕ B
Possibilistic    min(a,b)    max(a,b)      xor(a,b)     (A and B considered dependent)
Probabilistic    ab          a + b − ab    Xor(a,b)     (A and B considered independent)


XOR means - ‘whenever A or B occurs, but never both’. We
can represent this as a fuzzy inference rule in 2 possible
ways:

Possibilistic:  xor(a,b) = max[min(a,1−b), min(1−a,b)]


Probabilistic:  Xor(a,b) = a + b − 3ab + a²b + ab² − a²b²
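Both XOR rules follow from composing AND, OR and NOT as (A ∧ ¬B) ∨ (¬A ∧ B) under the respective rule set. The sketch below builds them that way and checks the probabilistic composition against the closed form a + b − 3ab + a²b + ab² − a²b². The function names are illustrative, and NOT is assumed to be 1 − a in both rule sets.

```python
from math import isclose


def f_not(a):
    """Negation, assumed to be 1 - a for both rule sets."""
    return 1.0 - a


# Possibilistic rules (A and B considered dependent).
def poss_and(a, b):
    return min(a, b)


def poss_or(a, b):
    return max(a, b)


def poss_xor(a, b):
    return poss_or(poss_and(a, f_not(b)), poss_and(f_not(a), b))


# Probabilistic rules (A and B considered independent).
def prob_and(a, b):
    return a * b


def prob_or(a, b):
    return a + b - a * b


def prob_xor(a, b):
    return prob_or(prob_and(a, f_not(b)), prob_and(f_not(a), b))


def prob_xor_closed_form(a, b):
    return a + b - 3 * a * b + a * a * b + a * b * b - a * a * b * b


# The composed rule and the closed form agree on a grid of values.
grid = [i / 10 for i in range(11)]
assert all(
    isclose(prob_xor(a, b), prob_xor_closed_form(a, b), abs_tol=1e-12)
    for a in grid
    for b in grid
)
print(poss_xor(0.7, 0.2), round(prob_xor(0.7, 0.2), 4))
```

Both rules reduce to the Boolean XOR truth table when a and b are restricted to 0 and 1, which is a useful check on any candidate formula.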




Example: The Car Doctor -6 


[Table: the 8 truth assignments of S1, S2, S3 (F F F to T T T) against the conditional probabilities P(S|H1), P(S|H2), P(S|H3), with the priors of each hypothesis, e.g. P(H3) = 0.1.]


Following on from page 3 of the problem, we now take the basic set of
propositions and, using data acquired from, say, national car service
centres and/or our human expert's opinions, produce a set of
priors and conditional probabilities. Note each column sums to 1.


Next we'll systematically apply Bayes' rule to compute the posterior
probabilities and, for this application, use our possibilistic inference rule
to make intermediate inferences. We'll then show that we have all the
information needed to compute our conclusion or outcome, C1.




Example: The Car Doctor -7 


C1: What is the truth of the statement :
The estimated repair costs will be over £1000
From previously shown inference net -
(a) H4 = H1 ∨ H2      (b) H5 = ¬(H1 ∨ H2) ∧ H3


For our problem we will consider the particular case that all symptoms are
present, i.e. S123 = T.


Now evaluate P(H4|S), using (a) ......
P(H4|S123) = max [P(H1|S123), P(H2|S123)]

e.g. first use Bayes' rule to find
P(H1|S) = P(S|H1) P(H1) / P(S)
        = (0.63 × 0.0001) / 0.0002
        = 0.315


Similarly use Bayes Rule to compute P(H2|S123)




Example: The Car Doctor -8 


similarly P(H2|S) = 0.125


Now  P(H4|S) = max [P(H1|S), P(H2|S)]
             = max [0.315, 0.125]
             = 0.315


Using (b) on last page and calculating P(H5|S) results in
P(H5|S) = 0.5      Exercise: try calculating this yourself.
Now P(C1|S) = H4 ∨ [H5 ∧ S4]
            = max [P(H4|S), min(P(H5|S), v)]
            = max [0.315, min(0.5, 1)]      - if S4 is true, v = 1
            = 0.5


- so in 50% of cases when all symptoms S1-S4
are present, then the cost of repair will be >£1000
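The whole Car Doctor chain can be traced in a few lines: Bayes' rule for the first-level hypothesis, then the possibilistic rules (max for OR, min for AND) up to C1. In this sketch P(H2|S) = 0.125 and P(H5|S) = 0.5 are taken as given from the worked example rather than recomputed, and the variable names are illustrative.

```python
from math import isclose


def bayes(p_s_given_h, p_h, p_s):
    """P(H|S) = P(S|H) P(H) / P(S)."""
    return p_s_given_h * p_h / p_s


# First-level hypothesis H1 (broken con rod), with all symptoms present.
p_h1 = bayes(p_s_given_h=0.63, p_h=0.0001, p_s=0.0002)
p_h2 = 0.125  # given in the worked example

# (a) H4 = H1 or H2, possibilistic OR is max.
p_h4 = max(p_h1, p_h2)

# (b) H5 = not(H1 or H2) and H3; its value is given in the worked example.
p_h5 = 0.5

# C1 = H4 or (H5 and S4); S4 is asserted, so its truth value is 1.
s4 = 1.0
p_c1 = max(p_h4, min(p_h5, s4))

assert isclose(p_h1, 0.315)
print(round(p_h1, 3), round(p_h4, 3), p_c1)
```

Because the possibilistic rules only take maxima and minima, the conclusion is driven entirely by the strongest supporting branch, here H5 ∧ S4.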


Odds & Bayes Rule - 1 


Experts often give unreliable probabilities - much
better to reformulate problem in terms of odds

Certain evidence, probabilistic rule:

P(H|E) = P(E|H) P(H) / P(E)          (1)

P(¬H|E) = P(E|¬H) P(¬H) / P(E)       (2)

Define 'Odds' as:  O(x) = P(x) / (1 − P(x))

So, if we divide eq (1) by eq (2) we can write:

O(H|E) = [P(E|H) / P(E|¬H)] O(H) = λ O(H)

where O(H|E) is the odds on H given E, λ = P(E|H) / P(E|¬H) is the likelihood ratio, and O(H) is the prior odds on H


Odds & Bayes Rule - 2 


Likelihood ratio, sufficiency and necessity 


Indicates how presence or absence of evidence influences odds on hypothesis

Certain evidence, probabilistic rule:  O(H|E) = λ O(H)

λ ≫ 1 - presence of evidence reinforces belief in H
λ ≈ 0 - reduces belief in H
λ - sufficiency coefficient

However, if E is false or known not to be present, then
O(H|¬E) = λ' O(H)   where λ' - necessity coefficient

λ' = P(¬E|H) / P(¬E|¬H) = (1 − P(E|H)) / (1 − P(E|¬H))
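The odds form of Bayes' rule can be sketched as below and cross-checked against the normalised probability form. All numerical values are hypothetical, and the helper names (`odds`, `prob_from_odds`, `sufficiency`, `necessity`) are illustrative assumptions.

```python
from math import isclose


def odds(p):
    """O(x) = P(x) / (1 - P(x)); undefined for P(x) = 1."""
    if not 0 <= p < 1:
        raise ValueError("probability must satisfy 0 <= p < 1")
    return p / (1 - p)


def prob_from_odds(o):
    """Inverse of odds(): P(x) = O(x) / (1 + O(x))."""
    return o / (1 + o)


def sufficiency(p_e_given_h, p_e_given_not_h):
    """lambda = P(E|H) / P(E|not H): used when E is present."""
    return p_e_given_h / p_e_given_not_h


def necessity(p_e_given_h, p_e_given_not_h):
    """lambda' = (1 - P(E|H)) / (1 - P(E|not H)): used when E is absent."""
    return (1 - p_e_given_h) / (1 - p_e_given_not_h)


# Hypothetical numbers.
p_h, p_e_h, p_e_not_h = 0.2, 0.9, 0.3

p_h_given_e = prob_from_odds(sufficiency(p_e_h, p_e_not_h) * odds(p_h))
p_h_given_not_e = prob_from_odds(necessity(p_e_h, p_e_not_h) * odds(p_h))

# Cross-check against the normalised form of Bayes' rule.
assert isclose(p_h_given_e, p_e_h * p_h / (p_e_h * p_h + p_e_not_h * (1 - p_h)))
print(round(p_h_given_e, 4), round(p_h_given_not_e, 4))
```

With these numbers λ is 3 and λ' is about 0.14, so observing E raises the odds on H and observing its absence lowers them.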


Summary / Conclusions 


Probabilistic vs Logical reasoning 
> If I have toothache, then I have a cavity;
> If I have toothache, then there's a 95%
probability it's a cavity.
Bayes' rule used for diagnosis because....
> Estimating P(H|E) directly is what experts (e.g. doctors) do, but it is difficult;


> P(E|H) is often easier to determine, as are P(E), P(H)


> We rarely have all the information needed for just using
Bayes' Rule. We have to make modelling decisions to
represent the human decision making process.


> Use heuristics to assist with the problem. One way to achieve
this is to use Fuzzy Inference Rules, to infer some
‘degree of truth’ about a proposition at level k+1, from
substantiated hypotheses made at level k.


> Bayesian networks/classifiers used for more complex
problems.


